AMENDMENT TO THE CLAIMS

1. (Currently Amended) A method for identifying the misuse
of authorized access to a digital data gathering system by a user, comprising The
method according to Claim 28, further comprising:

a) constructing a user cluster index for [[a]] the user of a digital data
gathering system;

wherein the user cluster index comprises a list of families of data to 
which data from the digital data gathering results of the user were categorized; 

b) monitoring families of the further digital data gathering results 
of the user; and 

c) comparing the families of the further digital data gathering 
results of the user to the user cluster index to determine anomalies in the digital data 
gathering results; and

d) identifying a potential misuse when an anomaly is detected.

2. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system by a user according to
Claim 1, further comprising:

a) comparing the anomalies to the user cluster index to determine 
the ratio of anomalies to existing clusters; and 

b) reporting a potential misuse when the ratio exceeds a 
predetermined threshold.
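
By way of illustration only, and not as part of the claims, the comparison recited in Claims 1 and 2 might be sketched as follows. The function names, the representation of the user cluster index as a collection of family labels, and the treatment of an empty index are assumptions made for this sketch.

```python
def cluster_anomalies(user_cluster_index, further_families):
    """Return families seen in the further results that are absent from the index."""
    known = set(user_cluster_index)
    return [family for family in further_families if family not in known]


def exceeds_cluster_ratio(user_cluster_index, further_families, threshold):
    """Report a potential misuse when anomalies / existing clusters > threshold."""
    existing = len(set(user_cluster_index))
    if existing == 0:
        # Assumption: with no index built yet there is nothing to compare against.
        return False
    anomalies = set(cluster_anomalies(user_cluster_index, further_families))
    return len(anomalies) / existing > threshold
```

Here an anomaly is taken to be any family of data not already listed in the user cluster index; other definitions would fit the claim language equally well.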

3. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system according to Claim 1,
further comprising:

a) monitoring digital data gathering results of the user;

b) constructing a user lexicon for [[a]] the user of a digital data
gathering system;

wherein the user lexicon comprises a list of words or phrases gathered 
from documents of the digital data gathering results of the user; and 

c) comparing words or phrases gathered from the documents of the
further digital data gathering results to the user lexicon to determine anomalies in the
digital data gathering results; and

d) identifying a potential misuse when an anomaly is detected.

4. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system according to Claim 3,
further comprising:

a) monitoring digital data gathering queries of the user;

b) and wherein the user lexicon further comprises a list of words or
phrases gathered from the monitoring of the queries; and

c) comparing the further content of the further query queries of the
user to the user lexicon to determine any anomaly by the further query anomalies in
the queries; and

d) identifying a potential misuse when an anomaly is detected.

5. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system by a user according to
Claim 3, further comprising:

a) determining a ratio of anomalies to words or phrases in the
lexicon; and

b) reporting a potential misuse when the ratio exceeds a 
predetermined threshold. 
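
For illustration only, the lexicon construction of Claim 3 and the ratio of Claim 5 might be sketched as below. The tokenization rule (lowercase alphabetic words), the function names, and the choice to count each anomalous word once are assumptions of this sketch and are not recited in the claims.

```python
import re


def extract_terms(text):
    """Split text into a set of lowercase alphabetic words (assumed tokenization)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def build_lexicon(documents):
    """Collect the words gathered from the documents of the user's results."""
    lexicon = set()
    for doc in documents:
        lexicon |= extract_terms(doc)
    return lexicon


def lexicon_anomaly_ratio(lexicon, further_documents):
    """Ratio of words absent from the lexicon to the words in the lexicon."""
    if not lexicon:
        return 0.0  # assumption: an empty lexicon yields no ratio
    further_terms = set()
    for doc in further_documents:
        further_terms |= extract_terms(doc)
    anomalies = further_terms - lexicon
    return len(anomalies) / len(lexicon)
```

A potential misuse would then be reported when the returned ratio exceeds the predetermined threshold.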

6. (Currently Amended) The method for identifying the 
misuse of authorized access to a digital data gathering system by a user according to
Claim 3, wherein the user lexicon comprises a list of words or word strings 
identifying particular words or types of words, or both, extracted from documents 
returned in response to user queries.

7. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system according to Claim 1,
further comprising:

a) constructing a structured data profile for [[a]] the user of a digital
data gathering system;

b) wherein the structured data profile comprises a list of data 
identifying workplace employment characteristics of the user; 

c) comparing the further digital data gathering results of the user 
to the structured data profile to determine whether the further digital data gathering 
results are congruent with the structured data profile; and

d) identifying a potential misuse when the digital data gathering
results are not congruent with the structured data profile. 

8. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system according to Claim 7,
further comprising:

a) wherein the structured data profile comprising comprises a
structured data profile lexicon of terms and phrases indicating valid user activity; and

b) identifying a potential misuse when the digital data gathering
results are not congruent with the structured data profile.

9. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system according to Claim 3,
further comprising:

a) constructing a structured data profile for [[a]] the user of a digital
data gathering system;

b) wherein the structured data profile comprises a list of data 
identifying workplace employment characteristics of the user; 

c) comparing the further digital data gathering results of the user 
to the structured data profile to determine whether the further digital data gathering 
results are congruent with the structured data profile; and

d) identifying a potential misuse when the further digital data 
gathering results are not congruent with the structured data profile. 

10. (Currently Amended) A method for identifying the misuse
of authorized access to a digital data gathering system by a user, comprising: 

a) monitoring a content of digital data gathering results of the user,
wherein the content includes at least one of words and phrases;

b) constructing a user lexicon for a user of a digital data gathering
system;

wherein the user lexicon comprises a list of at least some of the words 
or phrases gathered from documents of the digital data gathering results of the user; 

monitoring a further content of further digital data gathering results
obtained by the user;

c) comparing words or phrases of the further content gathered from
the documents of the digital data gathering results to the user lexicon to determine
anomalies in the further digital data gathering results; and 

d) identifying a potential misuse when an anomaly is detected. 

11. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system by a user according to
Claim 10, further comprising:

a) determining a ratio of anomalies to the words or phrases in the
lexicon; and

b) reporting a potential misuse when the ratio exceeds a 
predetermined threshold. 

12. (Currently Amended) The method for identifying the
misuse of authorized access to a digital data gathering system by a user according to
Claim 10, wherein the user lexicon comprises a list of words or word strings 
identifying nouns extracted from documents returned in response to user queries. 

13. (Currently Amended) The method for identifying the 
misuse of authorized access to a digital data gathering system according to Claim 10,
further comprising: 

a) constructing a structured data profile for a user of a digital data
gathering system; 

b) wherein the structured data profile comprises a list of data 
identifying workplace employment characteristics of the user; 

c) comparing the further digital data gathering results of the user 
to the structured data profile to determine whether the further digital data gathering
results are congruent with the structured data profile; and

d) identifying a potential misuse when the further digital data
gathering results are not congruent with the structured data profile.

14. (Currently Amended) A method for identifying the misuse
of authorized access to a digital data gathering system by a user, comprising: 

a) constructing a structured data profile for a user of a digital data
gathering system; 

b) wherein the structured data profile comprises a list of data 
identifying workplace characteristics employment information of the user;

c) monitoring digital data gathering results of the user;

d) comparing digital data gathering results of the user to the
structured data profile to determine whether the digital data gathering results are
congruent with the structured data profile; and 

e) identifying a potential misuse when the digital data gathering 
results are not congruent with the structured data profile. 

Claims 15-16 (Canceled) 

17. (Currently Amended) The method for identifying the
misuse of authorized access to an information retrieval system according to Claim
[[15]] 29, further comprising: weighting potential misuses anomalies identified from
according to the user lexicon, the user cluster index, and the structured data profile to
determine a report of potential misuse.

18. (Currently Amended) The method for identifying the
misuse of authorized access to an information retrieval system according to Claim
[[15]] 29, further comprising: sending a notification of potential misuse when a
potential misuse an anomaly is identified from according to two or more of the user
lexicon, the user cluster index, and the structured data profile.

19. (Currently Amended) The method for identifying the 
misuse of authorized access to an information retrieval system according to Claim
[[15]] 29, wherein the user lexicon comprises a list of words or phrases gathered from 
metadata of documents returned in the query results. 

20. (Currently Amended) The method for identifying the
misuse of authorized access to an information retrieval system according to Claim
[[15]] 29, wherein the user lexicon comprises a list of words, or types of words, or 
both, extracted from documents returned in the query results. 

21. (Currently Amended) The method for identifying the
misuse of authorized access to an information retrieval system according to Claim
[[15]] 29, wherein the user cluster index comprises a list of families of topic data to 
which the data of the user information retrieval results have been categorized. 

22. (Currently Amended) A method for detecting misuse by a
user of an information retrieval system having a document collection, wherein
documents of the document collection are categorized into one of a plurality of
clusters according to topic, the method comprising the steps of:

a) pre-clustering the document collection;

b) tracking the one of the plurality of clusters cluster from which
any document read by the user originates; 

c) building up a profile of use for the user based on most frequently 
accessed clusters over a time sufficient to establish a confidence threshold for validity 
of the profile of the user; 

d) tracking each time the user retrieves and reads a document 
outside of the most frequently accessed clusters; and 

e) establishing a misuse threshold number for documents read 
outside of the most frequently accessed clusters and after the misuse threshold number 
is obtained, signaling that a potential misuse may have occurred. 
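
For illustration only, steps b) through e) of Claim 22 might be sketched as below. The class name, the use of a fixed number of reads as the confidence threshold, and the choice of the `top_k` most common clusters as the "most frequently accessed clusters" are assumptions of this sketch; the pre-clustering of step a) is assumed to have already produced the `doc_to_cluster` mapping.

```python
from collections import Counter


class ClusterUsageMonitor:
    def __init__(self, doc_to_cluster, profile_reads, top_k, misuse_threshold):
        self.doc_to_cluster = doc_to_cluster      # result of pre-clustering
        self.profile_reads = profile_reads        # reads needed to trust the profile
        self.top_k = top_k                        # clusters kept in the profile
        self.misuse_threshold = misuse_threshold  # outside reads before signaling
        self.counts = Counter()
        self.reads = 0
        self.frequent = None                      # set once the profile is built
        self.outside_reads = 0

    def record_read(self, doc_id):
        """Track one document read; return True when misuse should be signaled."""
        cluster = self.doc_to_cluster[doc_id]
        if self.frequent is None:
            # Still building the profile of use.
            self.counts[cluster] += 1
            self.reads += 1
            if self.reads >= self.profile_reads:
                self.frequent = {c for c, _ in self.counts.most_common(self.top_k)}
            return False
        if cluster not in self.frequent:
            self.outside_reads += 1
        return self.outside_reads >= self.misuse_threshold
```

In this sketch no signal is raised while the profile is still being established, which is one possible reading of the confidence threshold in step c).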

23. (Currently Amended) A method for detecting misuse by a
user of an information retrieval system having a document collection, comprising the 
steps of: 

a) retrieving documents in response to user queries; 

b) clustering the retrieved documents by category based upon a 
content of each of the retrieved documents, wherein the content includes at least one 
of terms, phrases, and topics;

c) establishing and obtaining a threshold number of retrieved 
documents and after the threshold number of retrieved documents is obtained, 
determining a size for each clusters, and further denoting clusters of a large enough 
size as valid clusters; and 

d) determining if a sufficient number of retrieved documents do not 
participate in any valid cluster and if not, sounding an alarm signaling that a potential
misuse may have occurred.
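
For illustration only, steps c) and d) of Claim 23 might be sketched as below. The sketch assumes the clustering of step b) has already assigned a category label to each retrieved document, and it adopts one reading of step d): the alarm sounds when more than `max_outliers` retrieved documents fall outside every valid cluster. The function and parameter names are hypothetical.

```python
from collections import Counter


def check_retrieved(categories, min_documents, min_cluster_size, max_outliers):
    """categories holds one category label per retrieved document.

    Returns None until the threshold number of documents is obtained,
    otherwise True when the alarm should sound and False when it should not.
    """
    if len(categories) < min_documents:
        return None
    sizes = Counter(categories)
    # Clusters of a large enough size are denoted valid clusters.
    valid = {label for label, size in sizes.items() if size >= min_cluster_size}
    outliers = sum(1 for label in categories if label not in valid)
    return outliers > max_outliers
```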

24. (Currently Amended) A method for detecting misuse by a
user of an information retrieval system having a document collection, comprising the 
steps of: 

a) identifying top weighted terms from documents retrieved by the 
user from searches of the document collection and storing the top weighted terms in 
a user-specific lexicon; 

b) tracking user activity until the rate of new terms added slows and 
the user-specific lexicon stabilizes to form a user profile; 

c) identifying for each new query, if the top weighted terms are in
the user-specific lexicon; 

d) tracking a ratio of newly occurring terms to existing user-specific 
lexicon terms; and 

e) if the ratio of newly occurring terms to existing user-specific
lexicon terms exceeds a threshold, sending an alarm signaling that a potential misuse
may have occurred.
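
For illustration only, the steps of Claim 24 might be sketched as below. The claim does not name a weighting scheme, so raw term frequency stands in for the term weight here; the class name, the stabilization rule (the share of new terms in a query's top terms falling to `stable_rate` or below), and the tokenization are likewise assumptions of this sketch.

```python
import re
from collections import Counter


def top_weighted_terms(documents, k):
    """Return the k most frequent terms; raw frequency stands in for the weight."""
    counts = Counter()
    for doc in documents:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return [term for term, _ in counts.most_common(k)]


class LexiconProfile:
    def __init__(self, k, stable_rate, alarm_ratio):
        self.k = k
        self.stable_rate = stable_rate  # new-term share at which the lexicon is stable
        self.alarm_ratio = alarm_ratio  # new terms / lexicon terms that triggers an alarm
        self.lexicon = set()
        self.stable = False

    def observe(self, documents):
        """Process the documents retrieved for one query; return True to alarm."""
        terms = set(top_weighted_terms(documents, self.k))
        new_terms = terms - self.lexicon
        if not self.stable:
            # Profile-building phase: grow the lexicon until new terms slow down.
            had_terms = bool(self.lexicon)
            self.lexicon |= terms
            if had_terms and terms and len(new_terms) / len(terms) <= self.stable_rate:
                self.stable = True
            return False
        return len(new_terms) / len(self.lexicon) > self.alarm_ratio
```

Once the lexicon has stabilized, this sketch stops adding terms to it, so every later query is compared against the same user profile.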

25. (Currently Amended) The method for detecting misuse by
a user of an information retrieval system having a document collection, according to
Claim 24, further comprising the steps of:

a) tagging the documents to identify words in the documents by
type;

b) running an original query of terms and phrases; 

c) selecting specific types of words from relevant documents 
retrieved by the original query and adding these terms to a second query; and 

d) iteratively selecting specific types of words from relevant
documents retrieved by each query and adding the selected specific types of words to 
a further query to filter the user-specific lexicon. 

26. (Currently Amended) A method for detecting misuse by a 
user of an information retrieval system having a document collection, comprising the 
steps of: 

a) identifying structured data sources that can be used to identify 
what the user is working on; 

b) querying these sources and, for each source, mapping a 
structured result into a structured data lexicon of terms and phrases that indicate valid 
user activity; 

c) for each new query, tracking a ratio of terms found in the
structured data lexicon to those not found in the structured data lexicon; and 

d) if the ratio exceeds a threshold, sending an alarm signaling that
a misuse may have occurred. 

27. (Currently Amended) A method for detecting misuse by a
user of an information retrieval system having a document collection, comprising the 
steps of: 

a) identifying structured data sources that can be used to identify 
what the user is working on; 

b) querying the identified structured data sources and, for each 
source queried, mapping a structured result into a structured data lexicon of terms and
phrases that indicate valid user activity; 

c) for each new query, retrieving relevant documents for that new
query;

d) extracting key terms from the relevant documents; 

e) identifying the ratio of key retrieved terms found in the lexicon 
to those not found in the lexicon; and 

f) if the ratio exceeds a threshold, sending an alarm signaling that 
a misuse may have occurred. 
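
For illustration only, the structured data lexicon and ratio test of Claims 26 and 27 might be sketched as below. The record shape (one dictionary per queried source) and the function names are hypothetical. The claims do not spell out the direction in which the ratio signals misuse; this sketch assumes the alarm is sent when terms outside the lexicon, relative to terms inside it, exceed the threshold.

```python
def build_structured_lexicon(structured_results):
    """Map each structured result (assumed to be a dict) into lexicon terms."""
    lexicon = set()
    for record in structured_results:
        for value in record.values():
            lexicon.update(str(value).lower().split())
    return lexicon


def outside_ratio(lexicon, terms):
    """Ratio of terms not found in the lexicon to terms found in it."""
    found = sum(1 for term in terms if term.lower() in lexicon)
    missing = len(terms) - found
    if found == 0:
        # Assumption: with nothing found, any missing term counts as unbounded.
        return float("inf") if missing else 0.0
    return missing / found


def should_alarm(lexicon, terms, threshold):
    """Send an alarm when the ratio exceeds the threshold."""
    return outside_ratio(lexicon, terms) > threshold
```

For Claim 27, `terms` would be the key terms extracted from the documents retrieved for the new query rather than the query terms themselves.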

28. (New) A method for identifying a misuse of an authorized
user of an information retrieval system, the method comprising: 

monitoring a content of a plurality of at least one of queries entered by 
the user and digital data gathering results obtained by the user, wherein the content 
includes at least one of terms, phrases, and topics; 

constructing a profile of use for the user using the content; 

monitoring a further content of at least one of a further query entered 
by the user and further digital data gathering results obtained by the user; 

comparing the further content to the profile of use to determine whether 
the at least one of the further query entered by the user and the further digital data 
gathering results is an anomaly; 

identifying a potential misuse when an anomaly is detected. 

29. (New) The method according to Claim 28, wherein the 
profile of use comprises a user lexicon of user result words or phrases, a user cluster 
index of result document topic categories, and a structured data profile of known user 
characteristics, and further comprising: 

comparing the further content to each of the user lexicon of user result 
words or phrases, the user cluster index, and the structured data profile. 
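
For illustration only, the three-way comparison of Claim 29, together with the weighting of Claim 17 and the two-or-more test of Claim 18, might be sketched as below. The function names, the anomaly rule used for each profile component, and the integer weights are assumptions of this sketch.

```python
def compare_to_profile(further_terms, further_topics,
                       user_lexicon, user_cluster_index, structured_terms):
    """Return, per profile component, whether the further content is anomalous."""
    return {
        # Anomalous when some term is missing from the user lexicon.
        "user_lexicon": bool(set(further_terms) - set(user_lexicon)),
        # Anomalous when some topic is missing from the user cluster index.
        "user_cluster_index": bool(set(further_topics) - set(user_cluster_index)),
        # Anomalous when no term matches the structured data profile.
        "structured_data_profile": not (set(further_terms) & set(structured_terms)),
    }


def weighted_score(flags, weights):
    """Sum the weights of the components that flagged an anomaly (Claim 17)."""
    return sum(weights[name] for name, flagged in flags.items() if flagged)


def should_notify(flags):
    """Notify when two or more components flag an anomaly (Claim 18)."""
    return sum(flags.values()) >= 2
```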



IIT-171 






MDS/l 



